Numpy array to Tensor
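The conversion cell itself isn't shown above, so here is a minimal sketch (the array name `arr` is illustrative): `torch.from_numpy` shares memory with the NumPy array, while `torch.tensor` makes a copy.

```python
import numpy as np
import torch

arr = np.array([[1.0, 2.0], [3.0, 4.0]])

t_shared = torch.from_numpy(arr)   # shares memory with arr (keeps float64)
t_copy   = torch.tensor(arr)       # makes an independent copy

arr[0, 0] = 99.0
print(t_shared[0, 0])   # tensor(99., dtype=torch.float64) -- follows arr
print(t_copy[0, 0])     # tensor(1., dtype=torch.float64) -- unchanged
```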
Pandas DataFrame to Tensor
|   | gender | scores |
|---|---|---|
| 0 | male | 40 |
| 1 | female | 50 |
| 2 | male | 46 |
--------------------------------------------------------------------------- TypeError Traceback (most recent call last) <ipython-input-11-55a1ef6d4753> in <cell line: 0>() ----> 1 torch.tensor(df.values) TypeError: can't convert np.ndarray of type numpy.object_. The only supported types are: float64, float32, float16, complex64, complex128, int64, int32, int16, int8, uint64, uint32, uint16, uint8, and bool.
|   | scores | gender_female | gender_male |
|---|---|---|---|
| 0 | 40 | 0.0 | 1.0 |
| 1 | 50 | 1.0 | 0.0 |
| 2 | 46 | 0.0 | 1.0 |
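The TypeError above comes from the `object` dtype of the mixed string/number frame. One-hot encoding the categorical column gives a purely numeric frame that converts cleanly; a hedged sketch (the exact encoding call used in the notebook isn't shown, column names follow the table above):

```python
import pandas as pd
import torch

df = pd.DataFrame({"gender": ["male", "female", "male"],
                   "scores": [40, 50, 46]})

# one-hot encode the categorical column, then cast everything to a numeric dtype
df_encoded = pd.get_dummies(df, columns=["gender"]).astype("float32")

X = torch.tensor(df_encoded.values)   # now converts cleanly (dtype=torch.float32)
print(X)
```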
AutoGrad
\[
\begin{aligned}
q_1(z) &= z^2, \quad q_2(z) = z^3, \quad q_3(z) = e^z \\
p(z) &= \dfrac{q_1}{q_2} + q_3 = \dfrac{z^2}{z^3} + e^z = \dfrac{1}{z} + e^z \\
\dfrac{\partial p(z)}{\partial z} &= -\dfrac{1}{z^2} + e^z
\end{aligned}
\]
Derivative along multiple paths (chain rule)
\(\dfrac{\partial p(z)}{\partial z} = \dfrac{\partial p}{\partial q_1}\dfrac{\partial q_1}{\partial z} + \dfrac{\partial p}{\partial q_2}\dfrac{\partial q_2}{\partial z} + \dfrac{\partial p}{\partial q_3}\dfrac{\partial q_3}{\partial z}\)
= \(\dfrac{1}{q_2} \cdot 2z + \left(\dfrac{-q_1}{q_2^2}\right) \cdot 3z^2 + 1 \cdot e^z\)
= \(\dfrac{2}{z^2} - \dfrac{3}{z^2} + e^z\)
= \(\dfrac{-1}{z^2} + e^z\)
tensor(7.8891, grad_fn=<AddBackward0>)
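The value above is \(p(z)\) at \(z = 2\) (i.e. \(1/2 + e^2 \approx 7.889\)). A hedged sketch of how autograd reproduces the hand-computed derivative; the variable names mirror the math, the exact cell code isn't shown above:

```python
import torch

z = torch.tensor(2.0, requires_grad=True)

q1 = z**2
q2 = z**3
q3 = torch.exp(z)

p = q1/q2 + q3      # 1/z + e^z  ->  tensor(7.8891, grad_fn=<AddBackward0>)
p.backward()

print(p)            # ~7.8891
print(z.grad)       # -1/z^2 + e^z at z=2  ->  ~7.1391
```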
UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
c = torch.tensor(z**2, requires_grad=True)
tensor(4., requires_grad=True)
--------------------------------------------------------------------------- RuntimeError Traceback (most recent call last) <ipython-input-158-5afd61dd1bf3> in <cell line: 0>() 1 # derivation 2 p = (q1/q2) + q3 ----> 3 p.backward() 4 z.grad /usr/local/lib/python3.11/dist-packages/torch/_tensor.py in backward(self, gradient, retain_graph, create_graph, inputs) 624 inputs=inputs, 625 ) --> 626 torch.autograd.backward( 627 self, gradient, retain_graph, create_graph, inputs=inputs 628 ) /usr/local/lib/python3.11/dist-packages/torch/autograd/__init__.py in backward(tensors, grad_tensors, retain_graph, create_graph, grad_variables, inputs) 345 # some Python versions print out the first line of a multi-line function 346 # calls in the traceback and some print out the last line --> 347 _engine_run_backward( 348 tensors, 349 grad_tensors_, /usr/local/lib/python3.11/dist-packages/torch/autograd/graph.py in _engine_run_backward(t_outputs, *args, **kwargs) 821 unregister_hooks = _register_logging_hooks_on_whole_graph(t_outputs) 822 try: --> 823 return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 824 t_outputs, *args, **kwargs 825 ) # Calls into the C++ engine to run the backward pass RuntimeError: Trying to backward through the graph a second time (or directly access saved tensors after they have already been freed). Saved intermediate values of the graph are freed when you call .backward() or autograd.grad(). Specify retain_graph=True if you need to backward through the graph a second time or if you need to access saved tensors after calling backward.
--------------------------------------------------------------------------- RuntimeError Traceback (most recent call last) <ipython-input-146-a077ad1754e0> in <cell line: 0>() ----> 1 p.backward() /usr/local/lib/python3.11/dist-packages/torch/_tensor.py in backward(self, gradient, retain_graph, create_graph, inputs) 624 inputs=inputs, 625 ) --> 626 torch.autograd.backward( 627 self, gradient, retain_graph, create_graph, inputs=inputs 628 ) /usr/local/lib/python3.11/dist-packages/torch/autograd/__init__.py in backward(tensors, grad_tensors, retain_graph, create_graph, grad_variables, inputs) 345 # some Python versions print out the first line of a multi-line function 346 # calls in the traceback and some print out the last line --> 347 _engine_run_backward( 348 tensors, 349 grad_tensors_, /usr/local/lib/python3.11/dist-packages/torch/autograd/graph.py in _engine_run_backward(t_outputs, *args, **kwargs) 821 unregister_hooks = _register_logging_hooks_on_whole_graph(t_outputs) 822 try: --> 823 return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 824 t_outputs, *args, **kwargs 825 ) # Calls into the C++ engine to run the backward pass RuntimeError: Trying to backward through the graph a second time (or directly access saved tensors after they have already been freed). Saved intermediate values of the graph are freed when you call .backward() or autograd.grad(). Specify retain_graph=True if you need to backward through the graph a second time or if you need to access saved tensors after calling backward.
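Both tracebacks above come from calling `.backward()` on a graph whose saved intermediate values were already freed by an earlier backward pass. A minimal sketch of the fix the error message suggests, rebuilding the graph fresh and passing `retain_graph=True` on every backward except the last (values are for z = 2):

```python
import torch

z = torch.tensor(2.0, requires_grad=True)
q1, q2, q3 = z**2, z**3, torch.exp(z)
p = q1/q2 + q3

p.backward(retain_graph=True)   # keep the saved intermediate values alive
print(z.grad)                   # ~7.1391

p.backward()                    # a second backward now works; gradients accumulate in z.grad
print(z.grad)                   # roughly double the previous value (~14.2782)
```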
UserWarning: The .grad attribute of a Tensor that is not a leaf Tensor is being accessed. Its .grad attribute won't be populated during autograd.backward(). If you indeed want the .grad field to be populated for a non-leaf Tensor, use .retain_grad() on the non-leaf Tensor. If you access the non-leaf Tensor by mistake, make sure you access the leaf Tensor instead. See github.com/pytorch/pytorch/pull/30531 for more informations. (Triggered internally at /pytorch/build/aten/src/ATen/core/TensorBody.h:489.)
p.grad
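`p` is a non-leaf tensor (it was computed from `z`), so its `.grad` is not populated; autograd only fills `.grad` for leaf tensors such as `z`. If the gradient with respect to an intermediate is needed, it can be requested with `retain_grad()` before backward; a hedged sketch:

```python
import torch

z = torch.tensor(2.0, requires_grad=True)

q1 = z**2
q1.retain_grad()                 # ask autograd to keep the gradient of this non-leaf tensor

p = q1 / z**3 + torch.exp(z)
p.backward()

print(q1.grad)                   # dp/dq1 = 1/q2 = 1/z^3 = 0.125
print(z.grad)                    # leaf gradient: -1/z^2 + e^z ≈ 7.1391
```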
Collecting torchviz
  Downloading torchviz-0.0.3-py3-none-any.whl.metadata (2.1 kB)
Requirement already satisfied: torch in /usr/local/lib/python3.11/dist-packages (from torchviz) (2.6.0+cu124)
Requirement already satisfied: graphviz in /usr/local/lib/python3.11/dist-packages (from torchviz) (0.20.3)
Successfully installed nvidia-cublas-cu12-12.4.5.8 nvidia-cuda-cupti-cu12-12.4.127 nvidia-cuda-nvrtc-cu12-12.4.127 nvidia-cuda-runtime-cu12-12.4.127 nvidia-cudnn-cu12-9.1.0.70 nvidia-cufft-cu12-11.2.1.3 nvidia-curand-cu12-10.3.5.147 nvidia-cusolver-cu12-11.6.1.9 nvidia-cusparse-cu12-12.3.1.170 nvidia-nvjitlink-cu12-12.4.127 torchviz-0.0.3
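torchviz is presumably installed here to visualize the computation graph. A hedged sketch of typical usage (the exact call in the notebook isn't shown; `make_dot` is torchviz's graph-drawing helper and the filename is illustrative):

```python
import torch
from torchviz import make_dot

z = torch.tensor(2.0, requires_grad=True)
p = z**2 / z**3 + torch.exp(z)

dot = make_dot(p, params={"z": z})    # graphviz Digraph of the autograd graph of p
dot.render("p_graph", format="png")   # writes p_graph.png
```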
from IPython.display import IFrame
IFrame(src="https://iitm-pod.slides.com/arunprakash_ai/cs6910-lecture-4/fullscreen#/0/43", width=800, height=600)
tensor([[1., 2., 1.]], grad_fn=<MmBackward0>)
tensor([[0.1451, 0.1451, 0.7098]], grad_fn=<SoftmaxBackward0>)
tensor([[-0.0046, -0.0024, -0.0171],
[-0.0000, -0.0000, -0.0000],
[-0.0046, -0.0024, -0.0171]])
tensor([[ 0.0000, -0.0170, -0.0465],
[ 0.0000, -0.0204, -0.0561],
[ 0.0000, -0.0170, -0.0465]])
tensor([[ 0.1324, 0.1324, -0.2648],
[ 0.1324, 0.1324, -0.2648],
[ 0.0980, 0.0980, -0.1959]])
ANN
- Training Pipeline
  - Define Model
  - for epoch in range(epochs):
    - Forward pass
    - Loss calculation
    - Backward pass
    - Parameter update
  - Model Evaluation
- Improve Training Pipeline using nn.Module and torch.optim
  - nn.Linear
  - Activation Functions (nn.ReLU, nn.Sigmoid, nn.Softmax)
  - nn.Sequential Container
  - Loss Functions (nn.BCELoss, nn.CrossEntropyLoss, etc.)
  - torch.optim (SGD, Adam, etc.)
array([[ 0.03807591, 0.05068012, 0.06169621, 0.02187239, -0.0442235 ,
-0.03482076, -0.04340085, -0.00259226, 0.01990749, -0.01764613],
[-0.00188202, -0.04464164, -0.05147406, -0.02632753, -0.00844872,
-0.01916334, 0.07441156, -0.03949338, -0.06833155, -0.09220405],
[ 0.08529891, 0.05068012, 0.04445121, -0.00567042, -0.04559945,
-0.03419447, -0.03235593, -0.00259226, 0.00286131, -0.02593034],
[-0.08906294, -0.04464164, -0.01159501, -0.03665608, 0.01219057,
0.02499059, -0.03603757, 0.03430886, 0.02268774, -0.00936191],
[ 0.00538306, -0.04464164, -0.03638469, 0.02187239, 0.00393485,
0.01559614, 0.00814208, -0.00259226, -0.03198764, -0.04664087]])
Linear(in_features=10, out_features=4, bias=True)
<function Tensor.size>
array([ 0.03807591, 0.05068012, 0.06169621, 0.02187239, -0.0442235 ,
-0.03482076, -0.04340085, -0.00259226, 0.01990749, -0.01764613])
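The feature rows above look like scikit-learn's diabetes dataset, but the cell that builds `X_train`/`y_train` isn't shown in this section. A hedged sketch of one way the tensors used below could be prepared (the 80/20 split giving 353/89 rows and the float32 casting are assumptions inferred from later outputs):

```python
import torch
from sklearn.datasets import load_diabetes
from sklearn.model_selection import train_test_split

X, y = load_diabetes(return_X_y=True)          # X: (442, 10), y: (442,)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=42)

X_train = torch.tensor(X_tr, dtype=torch.float32)   # (353, 10)
y_train = torch.tensor(y_tr, dtype=torch.float32)
X_test = torch.tensor(X_te, dtype=torch.float32)    # (89, 10)
y_test = torch.tensor(y_te, dtype=torch.float32)
```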
Sequential(
(0): Linear(in_features=10, out_features=8, bias=True)
(1): Sigmoid()
(2): Linear(in_features=8, out_features=8, bias=True)
(3): Sigmoid()
(4): Linear(in_features=8, out_features=1, bias=True)
)
torch.manual_seed(42)
# define the model
model = nn.Sequential(
nn.Linear(in_features=10, out_features=8),
nn.Sigmoid(),
nn.Linear(in_features=8, out_features=8),
nn.Sigmoid(),
    nn.Linear(in_features=8, out_features=1)   # in_features must match the previous layer's out_features
)
for epoch in range(3):
# forward pass
y_pred = model.forward(X_train)
# loss computation
loss_func = nn.MSELoss()
loss = loss_func(y_train, y_pred.squeeze())
print(loss)
    # backward pass
    # note: gradients are never zeroed in this version, so they accumulate across epochs
    loss.backward()

    # weight updates
    learning_rate = 0.01
    with torch.no_grad():   # gradient tracking is off for the manual update
        for param in model.parameters():
            param.data -= learning_rate * param.grad
tensor(29624.2070, grad_fn=<MseLossBackward0>)
tensor(28690.6484, grad_fn=<MseLossBackward0>)
tensor(26897.7676, grad_fn=<MseLossBackward0>)
torch.manual_seed(42)
# define the model
model = nn.Sequential(
nn.Linear(in_features=10, out_features=8),
nn.Sigmoid(),
nn.Linear(in_features=8, out_features=8),
nn.Sigmoid(),
nn.Linear(in_features=8, out_features=1)
)
for epoch in range(3):
# forward pass
y_pred = model.forward(X_train)
# loss computation
loss_func = nn.MSELoss()
loss2 = loss_func(y_train, y_pred.squeeze())
print(loss2)
# backward pass
loss2.backward()
#weight updates
learning_rate = 0.01
with torch.no_grad(): # gradient tracking is off
for param in model.parameters():
param.data -= learning_rate * param.grad
    model.zero_grad()   # gradients must be reset to zero at each pass, otherwise they accumulate
tensor(29618.6719, grad_fn=<MseLossBackward0>)
tensor(27032.6934, grad_fn=<MseLossBackward0>)
tensor(22253.0332, grad_fn=<MseLossBackward0>)
- We want to write the model as a class (by subclassing nn.Module)
- Add dropout layers, batch normalization, and optimizers
- Evaluate the model
class myModel(nn.Module):
def __init__(self, in_features):
super().__init__() # inherit the nn.Module
self.network = nn.Sequential(
nn.Linear(in_features=in_features, out_features=8),
nn.Sigmoid(),
nn.Linear(in_features=8, out_features=8),
nn.Sigmoid(),
nn.Linear(in_features=8, out_features=1)
)
def forward(self,X):
        return self.network(X)
myModel(
(network): Sequential(
(0): Linear(in_features=10, out_features=8, bias=True)
(1): Sigmoid()
(2): Linear(in_features=8, out_features=8, bias=True)
(3): Sigmoid()
(4): Linear(in_features=8, out_features=1, bias=True)
)
)
----------------------------------------------------------------
Layer (type) Output Shape Param #
================================================================
Linear-1 [-1, 353, 8] 88
Sigmoid-2 [-1, 353, 8] 0
Linear-3 [-1, 353, 8] 72
Sigmoid-4 [-1, 353, 8] 0
Linear-5 [-1, 353, 1] 9
================================================================
Total params: 169
Trainable params: 169
Non-trainable params: 0
----------------------------------------------------------------
Input size (MB): 0.01
Forward/backward pass size (MB): 0.09
Params size (MB): 0.00
Estimated Total Size (MB): 0.10
----------------------------------------------------------------
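The layer table above looks like torchsummary output; a hedged sketch of how it could be produced (the `(353, 10)` input size is inferred from the output shapes, and the exact package/call in the original notebook may differ):

```python
from torchsummary import summary

model = myModel(10)
# input_size excludes the batch dimension, which is why the table shows a leading -1
summary(model, input_size=(353, 10), device="cpu")
```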
torch.manual_seed(42)
# define the model
model = myModel(X_train.shape[1])
EPOCHS = 3
for epoch in range(EPOCHS):
# forward pass
y_pred = model.forward(X_train)
# loss computation
loss_func = nn.MSELoss()
loss = loss_func(y_train, y_pred.squeeze())
print(loss)
#make gradients zero
model.zero_grad()
# backward pass
loss.backward()
#weight updates
with torch.no_grad(): # gradient tracking is off
for param in model.parameters():
param.data -= learning_rate * param.grad
tensor(29618.6719, grad_fn=<MseLossBackward0>)
tensor(27032.6934, grad_fn=<MseLossBackward0>)
tensor(22253.0332, grad_fn=<MseLossBackward0>)
class myModel(nn.Module):
def __init__(self, in_features):
super().__init__() # inherit the nn.Module
self.network = nn.Sequential(
nn.Linear(in_features=in_features, out_features=8),
nn.Sigmoid(),
nn.Linear(in_features=8, out_features=8),
nn.Sigmoid(),
nn.Linear(in_features=8, out_features=1)
)
def forward(self,X):
return self.network(X)
torch.manual_seed(42)
# define the model
model = myModel(X_train.shape[1])
optimizer = optim.SGD(model.parameters(), lr = 0.01)
EPOCHS = 30
for epoch in range(EPOCHS):
# forward pass
y_pred = model.forward(X_train)
# loss computation
loss_func = nn.MSELoss()
loss = loss_func(y_train, y_pred.squeeze())
print(loss)
#make gradients zero
optimizer.zero_grad()
# backward pass
loss.backward()
#weight updates
optimizer.step()
tensor(29618.6719, grad_fn=<MseLossBackward0>)
tensor(27032.6934, grad_fn=<MseLossBackward0>)
tensor(22253.0332, grad_fn=<MseLossBackward0>)
tensor(17304.5020, grad_fn=<MseLossBackward0>)
tensor(13647.4883, grad_fn=<MseLossBackward0>)
tensor(11171.1689, grad_fn=<MseLossBackward0>)
tensor(9502.8682, grad_fn=<MseLossBackward0>)
tensor(8380.4707, grad_fn=<MseLossBackward0>)
tensor(7625.6182, grad_fn=<MseLossBackward0>)
tensor(7118.0146, grad_fn=<MseLossBackward0>)
tensor(6776.6934, grad_fn=<MseLossBackward0>)
tensor(6547.1890, grad_fn=<MseLossBackward0>)
tensor(6392.8711, grad_fn=<MseLossBackward0>)
tensor(6289.1099, grad_fn=<MseLossBackward0>)
tensor(6219.3398, grad_fn=<MseLossBackward0>)
tensor(6172.4277, grad_fn=<MseLossBackward0>)
tensor(6140.8838, grad_fn=<MseLossBackward0>)
tensor(6119.6729, grad_fn=<MseLossBackward0>)
tensor(6105.4087, grad_fn=<MseLossBackward0>)
tensor(6095.8164, grad_fn=<MseLossBackward0>)
tensor(6089.3652, grad_fn=<MseLossBackward0>)
tensor(6085.0254, grad_fn=<MseLossBackward0>)
tensor(6082.1064, grad_fn=<MseLossBackward0>)
tensor(6080.1416, grad_fn=<MseLossBackward0>)
tensor(6078.8188, grad_fn=<MseLossBackward0>)
tensor(6077.9277, grad_fn=<MseLossBackward0>)
tensor(6077.3267, grad_fn=<MseLossBackward0>)
tensor(6076.9199, grad_fn=<MseLossBackward0>)
tensor(6076.6450, grad_fn=<MseLossBackward0>)
tensor(6076.4580, grad_fn=<MseLossBackward0>)
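The bullet list at the top also mentions other optimizers; swapping SGD for Adam only changes the optimizer construction, the rest of the loop stays the same. A hedged sketch (the learning rate is illustrative; `model`, `X_train`, `y_train`, `EPOCHS` are assumed to be in scope as above):

```python
optimizer = optim.Adam(model.parameters(), lr=0.01)

for epoch in range(EPOCHS):
    y_pred = model(X_train)
    loss = nn.MSELoss()(y_pred.squeeze(), y_train)
    print(loss)

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```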
#@title dropouts and batchnormalization
class myModel(nn.Module):
def __init__(self, in_features):
super().__init__() # inherit the nn.Module
self.network = nn.Sequential(
nn.Linear(in_features=in_features, out_features=8),
nn.BatchNorm1d(8),
nn.Sigmoid(),
nn.Dropout(0.4),
nn.Linear(in_features=8, out_features=8),
nn.BatchNorm1d(8),
nn.Sigmoid(),
nn.Dropout(0.4),
nn.Linear(in_features=8, out_features=1)
)
def forward(self,X):
return self.network(X)
torch.manual_seed(42)
# define the model
model = myModel(X_train.shape[1])
optimizer = optim.SGD(model.parameters(), lr = 0.01)EPOCHS = 30
for epoch in range(EPOCHS):
# forward pass
y_pred = model.forward(X_train)
# loss computation
loss_func = nn.MSELoss()
loss = loss_func(y_train, y_pred.squeeze())
print(loss)
#make gradients zero
optimizer.zero_grad()
# backward pass
loss.backward()
#weight updates
optimizer.step()
tensor(4138.2109, grad_fn=<MseLossBackward0>)
tensor(18139.7559, grad_fn=<MseLossBackward0>)
tensor(13536.3623, grad_fn=<MseLossBackward0>)
tensor(12693.8545, grad_fn=<MseLossBackward0>)
tensor(11923.5791, grad_fn=<MseLossBackward0>)
tensor(11242.9658, grad_fn=<MseLossBackward0>)
tensor(10641.5762, grad_fn=<MseLossBackward0>)
tensor(10110.1895, grad_fn=<MseLossBackward0>)
tensor(9640.6553, grad_fn=<MseLossBackward0>)
tensor(9225.7744, grad_fn=<MseLossBackward0>)
tensor(8859.1865, grad_fn=<MseLossBackward0>)
tensor(8535.2695, grad_fn=<MseLossBackward0>)
tensor(8249.0557, grad_fn=<MseLossBackward0>)
tensor(7996.1577, grad_fn=<MseLossBackward0>)
tensor(7772.6968, grad_fn=<MseLossBackward0>)
tensor(7575.2466, grad_fn=<MseLossBackward0>)
tensor(7400.7812, grad_fn=<MseLossBackward0>)
tensor(7246.6226, grad_fn=<MseLossBackward0>)
tensor(7110.4077, grad_fn=<MseLossBackward0>)
tensor(6990.0488, grad_fn=<MseLossBackward0>)
tensor(6883.6997, grad_fn=<MseLossBackward0>)
tensor(6789.7285, grad_fn=<MseLossBackward0>)
tensor(6706.6982, grad_fn=<MseLossBackward0>)
tensor(6633.3301, grad_fn=<MseLossBackward0>)
tensor(6568.5044, grad_fn=<MseLossBackward0>)
tensor(6511.2217, grad_fn=<MseLossBackward0>)
tensor(6460.6084, grad_fn=<MseLossBackward0>)
tensor(6415.8857, grad_fn=<MseLossBackward0>)
tensor(6376.3696, grad_fn=<MseLossBackward0>)
tensor(6341.4531, grad_fn=<MseLossBackward0>)
Evaluation
- training --> batchnorm and dropout keep updating (running statistics / random masks)
- testing --> batchnorm and dropout are frozen (see the sketch below)
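In PyTorch this switch is made with the module's mode flag: `model.train()` enables dropout and lets batch norm update its running statistics, `model.eval()` freezes both. A minimal sketch, combined with `torch.no_grad()` as in the evaluation cell below:

```python
model.eval()                    # dropout disabled, batchnorm uses its running statistics
with torch.no_grad():           # no computation graph is built during evaluation
    y_pred = model(X_test)
    test_loss = nn.MSELoss()(y_pred.squeeze(), y_test)
print(test_loss)

model.train()                   # switch back before any further training
```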
myModel(
(network): Sequential(
(0): Linear(in_features=10, out_features=8, bias=True)
(1): BatchNorm1d(8, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): Sigmoid()
(3): Dropout(p=0.4, inplace=False)
(4): Linear(in_features=8, out_features=8, bias=True)
(5): BatchNorm1d(8, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(6): Sigmoid()
(7): Dropout(p=0.4, inplace=False)
(8): Linear(in_features=8, out_features=1, bias=True)
)
)
with torch.no_grad():
y_pred = model.forward(X_test)
# loss computation
loss_func = nn.MSELoss()
loss = loss_func(y_test, y_pred.squeeze())
    print(loss)
tensor(5352.0630)
y_pred = model.forward(X_test)
# loss computation
loss_func = nn.MSELoss()
loss = loss_func(y_test, y_pred.squeeze())
print(loss)
tensor(5352.0630, grad_fn=<MseLossBackward0>)
torch.Size([33, 10])
torch.Size([33, 10])
torch.Size([33, 10])
torch.Size([33, 10])
torch.Size([33, 10])
torch.Size([33, 10])
torch.Size([33, 10])
torch.Size([33, 10])
torch.Size([33, 10])
torch.Size([33, 10])
torch.Size([23, 10])
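The shapes above are mini-batches of X_train, but the `train_loader` used in the next cell is not defined in this section. A hedged sketch of how it could be built with `TensorDataset` and `DataLoader` (`batch_size=33` is inferred from the printed shapes):

```python
from torch.utils.data import TensorDataset, DataLoader

train_dataset = TensorDataset(X_train, y_train)
train_loader = DataLoader(train_dataset, batch_size=33, shuffle=True)

for features, labels in train_loader:
    print(features.shape)   # torch.Size([33, 10]) for full batches, torch.Size([23, 10]) for the last one
```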
#@title mini-batch SGD
torch.manual_seed(42)
# define the model
model = myModel(X_train.shape[1])
optimizer = optim.SGD(model.parameters(), lr = 0.01)
EPOCHS = 30
model.train()
for epoch in range(EPOCHS):
loss_per_epoch = 0
for features, labels in train_loader:
# forward pass
y_pred = model.forward(features)
# loss computation
loss_func = nn.MSELoss()
loss = loss_func(labels, y_pred.squeeze())
#make gradients zero
optimizer.zero_grad()
# backward pass
loss.backward()
#weight updates
optimizer.step()
        loss_per_epoch += loss   # note: this accumulates tensors (hence the grad_fn below); loss.item() would accumulate plain floats
    print("avg loss per epoch =", loss_per_epoch/X_train.shape[0])   # divides by the sample count; dividing by len(train_loader) would give the mean batch loss
avg loss per epoch = tensor(473.1050, grad_fn=<DivBackward0>)
avg loss per epoch = tensor(229.4338, grad_fn=<DivBackward0>)
avg loss per epoch = tensor(194.4292, grad_fn=<DivBackward0>)
avg loss per epoch = tensor(172.4065, grad_fn=<DivBackward0>)
avg loss per epoch = tensor(200.5581, grad_fn=<DivBackward0>)
avg loss per epoch = tensor(189.5959, grad_fn=<DivBackward0>)
avg loss per epoch = tensor(180.0494, grad_fn=<DivBackward0>)
avg loss per epoch = tensor(171.7957, grad_fn=<DivBackward0>)
avg loss per epoch = tensor(164.0631, grad_fn=<DivBackward0>)
avg loss per epoch = tensor(158.9171, grad_fn=<DivBackward0>)
avg loss per epoch = tensor(154.1806, grad_fn=<DivBackward0>)
avg loss per epoch = tensor(157.7277, grad_fn=<DivBackward0>)
avg loss per epoch = tensor(156.0521, grad_fn=<DivBackward0>)
avg loss per epoch = tensor(158.0447, grad_fn=<DivBackward0>)
avg loss per epoch = tensor(153.1664, grad_fn=<DivBackward0>)
avg loss per epoch = tensor(140.6956, grad_fn=<DivBackward0>)
avg loss per epoch = tensor(155.7105, grad_fn=<DivBackward0>)
avg loss per epoch = tensor(165.5577, grad_fn=<DivBackward0>)
avg loss per epoch = tensor(157.5495, grad_fn=<DivBackward0>)
avg loss per epoch = tensor(127.2960, grad_fn=<DivBackward0>)
avg loss per epoch = tensor(169.2330, grad_fn=<DivBackward0>)
avg loss per epoch = tensor(140.0286, grad_fn=<DivBackward0>)
avg loss per epoch = tensor(148.1679, grad_fn=<DivBackward0>)
avg loss per epoch = tensor(159.4082, grad_fn=<DivBackward0>)
avg loss per epoch = tensor(134.4750, grad_fn=<DivBackward0>)
avg loss per epoch = tensor(132.2636, grad_fn=<DivBackward0>)
avg loss per epoch = tensor(142.8992, grad_fn=<DivBackward0>)
avg loss per epoch = tensor(137.3337, grad_fn=<DivBackward0>)
avg loss per epoch = tensor(135.9836, grad_fn=<DivBackward0>)
avg loss per epoch = tensor(139.2854, grad_fn=<DivBackward0>)